Spring 2026: Math 291 Daily Update

Tuesday, January 20

After reviewing class procedures and items on the syllabus, we began a discussion of \(2\times 2\) matrices over \(\mathbb{R}\), the set of which we denote by \(\textrm{M}_2(\mathbb{R})\). Given \(A = \begin{pmatrix} a & b\\c & d\end{pmatrix}\), we identified the entries of \(A\) as follows: \(a\) is the (1,1) entry, \(b\) is the (1,2) entry, \(c\) is the (2,1) entry and \(d\) is the (2,2) entry.

We established two fundamental operations:

  1. (i) Matrix addition: Given \(A = \begin{pmatrix} a & b\\c & d\end{pmatrix}\), \(B = \begin{pmatrix} e & f\\g & h\end{pmatrix}\), define \(A+B := \begin{pmatrix} a+e & b+f\\c+g & d+h\end{pmatrix}\).
  2. (ii) Scalar multiplication: Given \(\lambda \in \mathbb{R}\) and \(A\in \textrm{M}_2(\mathbb{R})\), define \(\lambda \cdot A = \begin{pmatrix} \lambda a & \lambda b\\\lambda c & \lambda d\end{pmatrix}\).
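
For instance, as a quick numerical illustration of both operations (immediate from the definitions):

\[\begin{pmatrix} 1 & 2\\3 & 4\end{pmatrix}+\begin{pmatrix} 5 & 6\\7 & 8\end{pmatrix} = \begin{pmatrix} 6 & 8\\10 & 12\end{pmatrix} \quad\quad \textrm{and}\quad\quad 3\cdot \begin{pmatrix} 1 & 2\\3 & 4\end{pmatrix} = \begin{pmatrix} 3 & 6\\9 & 12\end{pmatrix}.\]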

We then discussed at length the following properties. In what follows, \(A, B, C\in \textrm{M}_2(\mathbb{R})\) and \(\lambda, \lambda_1, \lambda_2 \in \mathbb{R}\).

1. An additive identity exists: For \({\bf 0}_{2\times 2} := \begin{pmatrix} 0 & 0\\0 & 0\end{pmatrix}\), we have \({\bf 0}_{2\times 2}+A = A\), for all \(A\in \textrm{M}_2(\mathbb{R})\).

2. Additive inverses exist: Given \(-A := \begin{pmatrix} -a & -b\\-c & -d\end{pmatrix}\), we have \(-A+A = {\bf 0}_{2\times 2}\).

3. Addition is commutative: \(A+B = B+A\).

4. Addition is associative: \((A+B)+C = A+(B+C)\), for \(A, B, C\in \textrm{M}_2(\mathbb{R})\).

5. Scalar multiplication distributes over matrix addition: \(\lambda \cdot (A+B) = \lambda \cdot A+\lambda \cdot B\), for \(\lambda \in \mathbb{R}\).

6. Scalar multiplication distributes over scalar addition: \((\lambda_1+\lambda_2)\cdot A = \lambda_1\cdot A+\lambda_2\cdot A\), for \(\lambda_1,\lambda_2\in \mathbb{R}\).

7. Scalar multiplication is associative: \((\lambda_1\lambda_2)\cdot A = \lambda _1\cdot (\lambda_2\cdot A)\).

8. \(1\cdot A = A\) and \(0\cdot A = {\bf 0}_{2\times 2}\).

We also discussed how one might prove these identities and noted that the properties above will recur throughout the semester as we discuss abstract vector spaces. We ended class by discussing the following consequences of properties (1)-(8) above. Keeping the same notation, we have

  1. (i) \(-1\cdot A = -A\).
  2. (ii) Additive inverses are unique, i.e., if \(A+C = {\bf 0}_{2\times 2}\), then \(C = -A\).
  3. (iii) Cancellation holds for matrix addition, i.e., if \(A+B = A+C\), then \(B = C\).

Thursday, January 22

In today's lecture we introduced two new operations for \(2\times 2\) matrices, namely multiplication of a matrix times a column and multiplication of two (\(2\times 2\)) matrices.

For \(A = \begin{pmatrix} a & b\\c & d\end{pmatrix}\), \(B = \begin{pmatrix} e & f\\g & h\end{pmatrix}\), \(C = \begin{pmatrix} u\\v\end{pmatrix}\), we defined

  1. (i) \(A\cdot C = \begin{pmatrix} a & b\\c & d\end{pmatrix} \cdot \begin{pmatrix} u\\v\end{pmatrix} := \begin{pmatrix} au+bv\\cu+dv\end{pmatrix}\).
  2. (ii) \(A\cdot B = \begin{pmatrix} a & b\\c & d\end{pmatrix}\cdot \begin{pmatrix} e & f\\g & h\end{pmatrix} = \begin{pmatrix} ae+bg & af+bh\\ce+dg & cf+dh\end{pmatrix}\).

We noted that the (1,1) entry of \(AB\) is \(R_1\cdot C_1\), the (1,2) entry is \(R_1\cdot C_2\), the (2,1) entry is \(R_2\cdot C_1\) and the (2,2) entry is \(R_2\cdot C_2\), where \(R_i\) is the \(i\)th row of \(A\) and \(C_j\) is the \(j\)th column of \(B\). We also noted that if we think of \(B\) as the matrix with columns \(C_1, C_2\), i.e., \(B = [C_1\ C_2]\), then \(AB = [AC_1\ AC_2]\), the matrix with columns \(AC_1, AC_2\).

We discussed how we can use the product of a matrix with a column to rewrite a system of equations as a single matrix equation, as follows. Given the system of two equations in two unknowns

\[\begin{align*} ax+by &= u\\ cx+dy &= v, \end{align*}\]

we can write this as \(AX = L\), where \(A = \begin{pmatrix} a & b\\c & d\end{pmatrix}\), \(X = \begin{pmatrix} x\\y\end{pmatrix}\) and \(L = \begin{pmatrix} u\\v\end{pmatrix}\).

We then discussed powers of \(2\times 2\) matrices and the class calculated \(A^2, A^3, A^4\) and conjectured the value of \(A^{2026}\), for \(A = \begin{pmatrix} 1 & 0\\1 & 1\end{pmatrix}\). We were easily able to conjecture \(A^n = \begin{pmatrix} 1 & 0\\n & 1\end{pmatrix}\), for all \(n\geq 1\), which led to a discussion of how to use mathematical induction to prove this fact. One first establishes the base case \(n = 1\), which in this case is clear. One then shows that the case \(n-1\) implies the case \(n\) (the inductive step), which here amounted to showing that \(A\cdot \begin{pmatrix} 1 & 0\\n-1 & 1\end{pmatrix} = \begin{pmatrix} 1 & 0\\n & 1\end{pmatrix} = A^n\). The class then used induction to prove the formula \(1+2+\cdots + n = \frac{n(n+1)}{2}\).
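
For readers who wish to experiment, the conjecture is also easy to check numerically. A minimal NumPy sketch (purely illustrative, not from the lecture) might look like:

```python
import numpy as np

A = np.array([[1, 0], [1, 1]])

# Check the conjectured formula A^n = [[1, 0], [n, 1]] for a range of n.
for n in range(1, 50):
    expected = np.array([[1, 0], [n, 1]])
    assert np.array_equal(np.linalg.matrix_power(A, n), expected)

print("Conjecture confirmed for n = 1, ..., 49")
```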

We moved on to discuss (but not prove) the following

Properties of matrix multiplication. Let \(A, B, C\) be \(2\times 2\) matrices.

  1. (i) \({\bf 0}_{2\times 2}\cdot A = {\bf 0}_{2\times 2} = A\cdot {\bf 0}_{2\times 2}\).
  2. (ii) For \(I_2 := \begin{pmatrix} 1 & 0\\0 & 1\end{pmatrix}\), \(A\cdot I_2 = A = I_2\cdot A\), i.e., a multiplicative identity exists.
  3. (iii) Multiplication distributes over matrix sums: \(A\cdot (B+C) = A\cdot B+A\cdot C\).
  4. (iv) Multiplication is associative: \(A(BC) = (AB)C\).
  5. (v) A matrix \(D\) satisfying \(AD = I_2 = DA\) is called an inverse of \(A\) and is denoted \(A^{-1}\).

We finished class by noting that if the matrix equation \(AX = L\) represents a system of equations (as above) and \(A\) has an inverse, then we can multiply both sides of the matrix equation by \(A^{-1}\) to get the solution \(X = A^{-1}L\).
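
For instance (a small illustration, with the inverse simply checked by multiplication): the system \(2x+y = 3\), \(x+y = 2\) has coefficient matrix \(A = \begin{pmatrix} 2 & 1\\1 & 1\end{pmatrix}\), and one verifies directly that \(D = \begin{pmatrix} 1 & -1\\-1 & 2\end{pmatrix}\) satisfies \(AD = I_2 = DA\), so that

\[X = A^{-1}L = \begin{pmatrix} 1 & -1\\-1 & 2\end{pmatrix}\begin{pmatrix} 3\\2\end{pmatrix} = \begin{pmatrix} 1\\1\end{pmatrix},\]

i.e., \(x = 1, y = 1\).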

Tuesday, January 27

We began class with the following definition. For the \(2\times 2\) matrix \(A = \begin{pmatrix} a & b\\c & d\end{pmatrix}\), the determinant of \(A\), denoted \(\det A\), equals \(ad-bc\).

We then discussed and verified the following

Properties of the determinant. Let \(A, B\) denote \(2\times 2\) matrices over \(\mathbb{R}\).

  1. (i) If \(A'\) is obtained from \(A\) by multiplying a row or column of \(A\) by \(\lambda \in \mathbb{R}\), then \(\det A' = \lambda \det A\).
  2. (ii) If \(A'\) is obtained from \(A\) by interchanging its rows or interchanging its columns, then \(\det A' = -\det A\).
  3. (iii) If \(A'\) is obtained from \(A\) by adding a multiple of one of its rows to another row, then \(\det A = \det A'\). Similarly for the columns of \(A\).

The operations (i)-(iii) are called elementary row or column operations.

  1. (iv) \(\det AB = \det A\cdot \det B\).
  2. (v) Suppose \(\det A \not = 0\). Set \(\Delta := \det A\). Then \(A^{-1}\) exists and we have \(A^{-1} = \begin{pmatrix} \frac{d}{\Delta} & -\frac{b}{\Delta}\\-\frac{c}{\Delta} & \frac{a}{\Delta}\end{pmatrix}\).
  3. (vi) Given vectors \(u = (a,b)\) and \(v = (c,d)\), the area of the parallelogram in \(\mathbb{R}^2\) spanned by \(u\) and \(v\) is \(|ad-bc|\), i.e., the absolute value of \(\det \begin{pmatrix} a & b\\c & d\end{pmatrix}\).
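
For instance, taking \(A = \begin{pmatrix} 1 & 2\\3 & 4\end{pmatrix}\), we have \(\Delta = 1\cdot 4 - 2\cdot 3 = -2\), and the formula in (v) gives

\[A^{-1} = \begin{pmatrix} \frac{4}{-2} & -\frac{2}{-2}\\-\frac{3}{-2} & \frac{1}{-2}\end{pmatrix} = \begin{pmatrix} -2 & 1\\ \frac{3}{2} & -\frac{1}{2}\end{pmatrix},\]

which one can check directly satisfies \(A\cdot A^{-1} = I_2 = A^{-1}\cdot A\).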

We ended class by looking at a typical system of two linear equations in two unknowns

\[\begin{align*} (1)\quad ax+by &= u\\ (2)\quad cx+dy &= v. \end{align*}\]

We noted that each equation corresponds to a straight line \(L_1, L_2\) (respectively) in \(\mathbb{R}^2\) and \((s,t)\) is a solution to the system if and only if \((s,t)\) is a point on each line. Thus the following are the only possibilities for the solution set to the given system of equations:

  1. (i) There is a unique solution. This occurs when \(L_1\) and \(L_2\) are not parallel, and thus intersect in a single point.
  2. (ii) There is no solution. This occurs when \(L_1\) and \(L_2\) are parallel.
  3. (iii) There are infinitely many solutions. This occurs when \(L_1 = L_2\), so that \((s,t)\) is a solution to the system if and only if it is a solution to the first (or second) equation.

Thus, there can never be a \(2\times 2\) system of linear equations with exactly 17 solutions! (Or with exactly \(n\) solutions for any \(n > 1\).)

Thursday, January 29

In the previous lecture we saw that given a system of linear equations

\[\begin{align*} ax+by &= u\\ cx+dy &= v \end{align*}\]

whose coefficient matrix \(A = \begin{pmatrix} a & b\\c & d\end{pmatrix}\) has non-zero determinant \(\Delta\), the solution to the system is given by

Cramer's Rule. For the system above, with \(\Delta \not = 0\), \(x = \frac{\det\begin{pmatrix} u & b\\v & d\end{pmatrix}}{\Delta}\) and \(y = \frac{\det\begin{pmatrix} a & u\\c & v\end{pmatrix}}{\Delta}\).

We noted that for large systems of linear equations, Cramer's rule is not cost effective, so we began a discussion of Gaussian elimination. We started with a specific system of equations, such as

\[\begin{align*} 2x+6y &= 8\\ 3x+y &= 4 \end{align*}\]

and performed a sequence of operations that changed the system, but preserved the solution. These operations were of the following form: Interchange equations, add a multiple of one equation to another equation, and multiply an equation by a non-zero number. This simplified the system to one trivially solvable, namely

\[\begin{align*} x &= 1\\ y &= 1. \end{align*}\]

We noted that in doing the various operations, the arithmetic involved the coefficients in the equations and the variables were essentially placeholders. This led to considering the corresponding augmented matrix \(\begin{bmatrix}2 & 6 & | & 8\\3 & 1 & | & 4\end{bmatrix}\). By performing the same operations on the rows of the augmented matrix that we did on the system of equations, this led to the augmented matrix \(\begin{bmatrix}1 & 0 & | & 1\\0 & 1 & | & 1\end{bmatrix}\), which corresponds to the system

\[\begin{align*} x &= 1\\ y &= 1. \end{align*}\]
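
For instance, one possible sequence of row operations carrying out this reduction is: multiply \(R_1\) by \(\frac{1}{2}\), replace \(R_2\) by \(R_2-3R_1\), multiply \(R_2\) by \(-\frac{1}{8}\), and finally replace \(R_1\) by \(R_1-3R_2\):

\[\begin{bmatrix}2 & 6 & | & 8\\3 & 1 & | & 4\end{bmatrix} \rightarrow \begin{bmatrix}1 & 3 & | & 4\\3 & 1 & | & 4\end{bmatrix} \rightarrow \begin{bmatrix}1 & 3 & | & 4\\0 & -8 & | & -8\end{bmatrix} \rightarrow \begin{bmatrix}1 & 3 & | & 4\\0 & 1 & | & 1\end{bmatrix} \rightarrow \begin{bmatrix}1 & 0 & | & 1\\0 & 1 & | & 1\end{bmatrix}.\]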

We formalized this process by defining

Elementary Row Operations. Let \(A\) be a \(2\times 2\) matrix (or any matrix in fact). The following constitute elementary row operations:

  1. (i) Interchange two rows.
  2. (ii) Add a multiple of one row to another row.
  3. (iii) Multiply a row by a non-zero number.

We noted that the goal was to put, if possible, the beginning augmented matrix \(\begin{bmatrix}a & b & | & u\\c & d & | & v\end{bmatrix}\) into the form \(\begin{bmatrix}1 & 0 & | & s\\0 & 1 & | & t\end{bmatrix}\), from which the solution \(x = s, y = t\) could be read. We noted that the strategy for the Gaussian elimination process should be as follows: Using elementary row operations, first get a 1 in the (1,1) entry of the augmented matrix, then use that 1 to get 0 below it. Then make, if possible, the (2,2) entry of the augmented matrix 1 and then use that 1 to get a 0 above it. This will always be possible when the original system has a unique solution.

We ended class by considering the remaining two cases. In one case, the final augmented matrix took the form \(\begin{bmatrix}1 & 3 & | & 4\\0 & 0 & | & 0\end{bmatrix}\), which corresponds to the system \(x+3y = 4\), from which one concludes \(x = 4-3y\). To describe the solution set we introduced another parameter \(t\) to get \(\{(4-3t, t)\ |\ t\in \mathbb{R}\}\) as the solution set. This is the case when the system has infinitely many solutions. We then saw an example where the final augmented matrix took the form \(\begin{bmatrix}1 & 3 & | & 4\\0 & 0 & | & 1\end{bmatrix}\), which meant the original system had no solution, since \(0 = 1\) is a contradiction.

Tuesday, February 3

After reviewing the possible outcomes for solving a system of two linear equations in two unknowns using Gaussian elimination, we considered the following \(2\times 3\) system of equations

\[\begin{align*} 2x+5y+2z &= 9\\ x+2y-z &= 4. \end{align*}\]

As expected, we began with the augmented matrix \(\begin{bmatrix}2 & 5 & 2 & | & 9\\1 & 2 & -1 & | & 4\end{bmatrix}\), applied elementary row operations with the same strategy as in the \(2\times 2\) case and arrived at \(\begin{bmatrix}1 & 0 & -9 & | & 2\\0 & 1 & 4 & | & 1\end{bmatrix}\), from which we derived the solution set \(\{(2+9t,1-4t,t)\ |\ t\in \mathbb{R}\}\), which is a one-parameter solution set.
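
For those who would like to check such reductions with software, SymPy's rref method reproduces this computation (purely illustrative, not from the lecture):

```python
from sympy import Matrix

# Augmented matrix for the system 2x + 5y + 2z = 9, x + 2y - z = 4.
M = Matrix([[2, 5, 2, 9],
            [1, 2, -1, 4]])

rref_matrix, pivot_columns = M.rref()
print(rref_matrix)    # Matrix([[1, 0, -9, 2], [0, 1, 4, 1]])
print(pivot_columns)  # (0, 1)
```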

We then considered the system

\[\begin{align*} 2x+4y-2z &= 8\\ x+2y-z &= 4. \end{align*}\]

Following the same procedure led to the augmented matrix \(\begin{bmatrix}1 & 2 & -1 & | & 4\\0 & 0 & 0 & | & 0\end{bmatrix}\), showing that the solution set is \(\{(4-2t_1+t_2, t_1, t_2)\ |\ t_1, t_2\in \mathbb{R}\}\), a two parameter solution set. We then recorded the fact that if we start with a system of two linear equations in three unknowns, the original augmented matrix \(\begin{bmatrix}a & b & c & | & u\\d & e & f & | & v\end{bmatrix}\) takes one of the following three forms after performing elementary row operations:

Case A:
\[\begin{bmatrix}1 & 0 & * & | & *\\0 & 1 & * & | & *\end{bmatrix}\]
Case B:
\[\begin{bmatrix}1 & * & * & | & *\\0 & 0 & 0 & | & 0\end{bmatrix}\]
Case C:
\[\begin{bmatrix}1 & * & * & | & *\\0 & 0 & 0 & | & \alpha\end{bmatrix}\]

where \(\alpha \neq 0\) in Case C. We noted that in Case A the solution set is infinite and is described by one parameter, in Case B the solution set is infinite and is described by two parameters, and in Case C there is no solution. In particular, no \(2\times 3\) system of linear equations has a unique solution. We further noted that in all cases we have seen, we have the following

Important fact. The number of independent parameters needed to describe the solution set is the number of variables minus the number of leading ones that appear in the final augmented matrix.

After noting that this important fact applies to systems of equations of any size, we then formalized the form the final matrix should take in the Gaussian elimination process. This applies to systems of linear equations of any size.

Reduced Row Echelon Form. An augmented matrix is in reduced row echelon form (RREF) if it satisfies the following conditions:

  1. (i) The leading entry of each non-zero row is 1.
  2. (ii) There are only zeros above and below each non-zero leading entry.
  3. (iii) The non-zero leading entries move to the right as one moves down the rows.
  4. (iv) All zero rows are at the bottom of the matrix.

We ended class by noting that the technique of Gaussian elimination, namely converting the system to an augmented matrix and applying elementary row operations to get a RREF, applies to systems of linear equations of any size and we illustrated this by solving the system

\[\begin{align*} x+y+z &= 6\\ 2x+4y+6z &= 28\\ 5x+7y+9z &= 46 \end{align*}\]

which has \(x = 1, y = 2, z = 3\) among its solutions. (In fact, the third equation is three times the first plus the second, so the solution set here is a one-parameter family.)

Thursday, February 5

We began class by reviewing the possible outcomes in terms of the RREFs for an augmented matrix representing a \(3\times 3\) system of linear equations. We then reminded the class of the important fact that given any system of linear equations, the number of independent parameters needed to describe the solution set is the number of variables minus the number of leading 1s in the RREF of the corresponding augmented matrix. We also noted that a system of linear equations is said to be homogeneous if the right hand side of the system consists entirely of zeros. In this case, there will always be at least one solution, namely the solution in which all of the given variables equal zero.

We then noted the following

General fact. Suppose \(A\) is an \(n\times m\) matrix and \(B\) is a \(p\times t\) matrix. Then we can only form the product \(AB\) when \(m = p\). In this case \(AB\) is an \(n\times t\) matrix whose \((i,j)\) entry is the \(i\)th row of \(A\) times the \(j\)th column of \(B\).

After computing a couple of products, we noted that the product rules (i)-(v) from January 22 hold, as long as the product exists. We then began a discussion of elementary \(2\times 2\) matrices.

Elementary \(2\times 2\) matrices. The elementary matrices are obtained by applying elementary row operations to \(I_2 = \begin{pmatrix} 1 & 0\\0 & 1\end{pmatrix}\).

  1. (i) Type I: Interchange the rows of \(I_2\), i.e., \(E = \begin{pmatrix} 0 & 1\\1 & 0\end{pmatrix}\).
  2. (ii) Type II: Add a multiple of one row of \(I_2\) to another, e.g., \(E = \begin{pmatrix} 1 & 0\\ \lambda & 1\end{pmatrix}\) or \(E = \begin{pmatrix} 1 & \lambda\\0 & 1\end{pmatrix}\), for \(\lambda \in \mathbb{R}\).
  3. (iii) Type III: Multiply a row of \(I_2\) by a non-zero number, e.g., \(E = \begin{pmatrix} \lambda & 0\\0 & 1\end{pmatrix}\) or \(E = \begin{pmatrix} 1 & 0\\0 & \lambda\end{pmatrix}\).

The class then verified (in the \(2\times 2\) case) that for any elementary matrix \(E\), \(EA\) is the same matrix obtained by applying the corresponding elementary row operation to \(A\). We then noted that elementary matrices are invertible, and their inverses are elementary matrices that can be easily guessed.

We then asked what it means, in terms of elementary matrices, if \(A\) is a \(2\times 2\) matrix that reduces to \(I_2\) via elementary row operations. We inferred that there exists a \(2\times 2\) matrix \(H\) such that \(HA = I_2\) and thus (via the bonus problem listed in today's homework) \(A\) is invertible and \(A^{-1}\) can be found via Gaussian elimination, as stated below.

Using Gaussian elimination to find \(A^{-1}\), if it exists. Given an \(n\times n\) matrix \(A\), start with the \(n\times (2n)\) augmented matrix \([A\ |\ I_n]\) and apply elementary row operations until either:

  1. (i) One arrives at \([I_n\ | \ B]\), in which case \(B = A^{-1}\) or
  2. (ii) At some point the left hand side of the augmented matrix has a row consisting entirely of zeros, in which case \(A\) does not have an inverse.
In terms of elementary matrices, if \(E_1, ..., E_r\) are the elementary matrices corresponding to the row operations used in (i), then \(A^{-1} = E_r \cdots E_2 E_1\).
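
For instance, for \(A = \begin{pmatrix} 1 & 2\\1 & 3\end{pmatrix}\), subtracting \(R_1\) from \(R_2\) and then subtracting \(2R_2\) from \(R_1\) gives

\[\begin{bmatrix}1 & 2 & | & 1 & 0\\1 & 3 & | & 0 & 1\end{bmatrix} \rightarrow \begin{bmatrix}1 & 2 & | & 1 & 0\\0 & 1 & | & -1 & 1\end{bmatrix} \rightarrow \begin{bmatrix}1 & 0 & | & 3 & -2\\0 & 1 & | & -1 & 1\end{bmatrix},\]

so \(A^{-1} = \begin{pmatrix} 3 & -2\\-1 & 1\end{pmatrix}\), which agrees with the determinant formula from the January 27 lecture.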

Tuesday, February 10

We began class by thinking of \(\mathbb{R}^2\) as a vector space even though we have not yet formally defined the general concept of a vector space. We noted that the elements of \(\mathbb{R}^2\) can be thought of as row or column vectors and as such one can add two vectors or scalar multiply a vector by a real number exactly as one does in Calculus II. We then noted that the elements of \(\mathbb{R}^2\) satisfy the Fundamental Properties from the lecture of January 20. We then pointed out (informally) that it is precisely these properties that make any set with addition and scalar multiplication into a vector space. We then gave the following

Definition-Proposition. Take \(v_1, v_2\in \mathbb{R}^2\). Then \(v_1, v_2\) are said to be linearly independent if the following equivalent statements hold:

  1. (i) If \(\alpha v_1+\beta v_2 = \vec{0}\), for \(\alpha,\beta \in \mathbb{R}\), then \(\alpha = 0 = \beta\).
  2. (ii) Neither vector is a scalar multiple of the other, i.e., we cannot write \(v_1 = \lambda v_2\) or \(v_2 = \lambda v_1\), for any \(\lambda \in \mathbb{R}\).

We noted that geometrically, condition (ii) just says that \(v_1\) and \(v_2\) do not lie on the same line through the origin in \(\mathbb{R}^2\). We also noted that if one of \(v_1, v_2\) is \(\vec{0}\), then \(v_1, v_2\) cannot be linearly independent. We also defined \(v_1, v_2\) to be linearly dependent if they are not linearly independent. We noted that the vectors \((1,2), (2,1)\) are linearly independent, while the vectors \((1,2), (4,8)\) are linearly dependent.

We then proved the following important theorem.

Theorem. Take vectors \(v_1 = (a,b), v_2 = (c,d) \in \mathbb{R}^2\) and set \(A := \begin{pmatrix} a & b\\c & d\end{pmatrix}\). Then \(v_1, v_2\) are linearly independent if and only if \(\det A \neq 0\).

We ended class by noting, but not verifying, that if \(v_1, v_2 \in \mathbb{R}^2\) are linearly independent, then given any vector \(u\in \mathbb{R}^2\), we can find (unique) \(\alpha, \beta \in \mathbb{R}\) such that \(u = \alpha v_1+\beta v_2\).

Thursday, February 12

We began class by reviewing the definition of what it means for two vectors in \(\mathbb{R}^2\) to be linearly independent. This led to a discussion and example of how any vector in \(\mathbb{R}^2\) is a linear combination of a fixed set of linearly independent vectors. In particular, we showed how \((6,9)\) can be written as a linear combination of the independent vectors \((2,3)\) and \((1,3)\). We then formally gave the following definitions.

Definitions. Given \(v_1, v_2\in \mathbb{R}^2\):

  1. (i) A linear combination of \(v_1, v_2\) is an expression of the form \(\alpha v_1+\beta v_2\), with \(\alpha, \beta \in \mathbb{R}\).
  2. (ii) The span of \(v_1, v_2\), denoted \(\langle v_1, v_2\rangle\), is the set of all possible linear combinations of \(v_1, v_2\).

We then established the following theorem:

Theorem. Given vectors \(v_1, v_2\in \mathbb{R}^2\), \(v_1, v_2\) are linearly independent if and only if \(\langle v_1, v_2\rangle = \mathbb{R}^2\).

Important facts. The equivalence in the theorem above depends on the fact that we are considering exactly two vectors in \(\mathbb{R}^2\). In \(\mathbb{R}^3\), two independent vectors never span \(\mathbb{R}^3\) and four spanning vectors are never linearly independent. However, in \(\mathbb{R}^3\), three vectors are linearly independent if and only if they span \(\mathbb{R}^3\). We will discuss the general notions of linear independence and spanning in the near future.

We ended class by observing that a line through the origin in \(\mathbb{R}^2\) is closed under vector addition and scalar multiplication, and noted that these properties will determine the concept of subspace to be discussed next week.

Tuesday, February 17

We began class with the following important definition.

Definition. A function \(T: \mathbb{R}^2\to \mathbb{R}^2\) is a linear transformation if:

  1. (i) \(T(v_1+v_2) = T(v_1)+T(v_2)\), for all \(v_1, v_2\in \mathbb{R}^2\)
  2. (ii) \(T(\lambda v) = \lambda T(v)\), for all \(\lambda \in \mathbb{R}\) and \(v\in \mathbb{R}^2\).

Examples of linear transformations given in class were:

  1. (i) \(T(x,y) = (2x+2y, -6x+2y)\);
  2. (ii) \(T(x,y) = (-y,x)\), rotation 90 degrees counterclockwise;
  3. (iii) \(T(x,y) = (y,-x+2y)\).

We noted that the second linear transformation has the property that no line through the origin is mapped to itself by the linear transformation (i.e., \(T\) has no eigenvectors), while the third linear transformation maps the line \(y = x\) to itself (i.e., the vector \((1,1)\) is an eigenvector).

We then noted that a linear transformation \(T: \mathbb{R}^2\to \mathbb{R}^2\) is totally determined by its effect on the standard basis of \(\mathbb{R}^2\) given by \(E = \{e_1, e_2\}\), where \(e_1 = (1,0)\) and \(e_2 = (0,1)\). In fact, if \(v = (a,b)\), then

\[T(v) = T(a,b) = T((a,0)+(0,b)) = T(a,0)+T(0,b) = aT(1,0)+bT(0,1) = aT(e_1)+bT(e_2),\]

showing that the value of \(T(v)\) is already determined by the values of \(T(e_1)\) and \(T(e_2)\).

We then gave the following definition, which we noted is a special case of a more general definition to come. We first reminded the class of the theorem from the previous lecture which showed that \(v_1, v_2\in \mathbb{R}^2\) are linearly independent if and only if they span \(\mathbb{R}^2\).

Definition. Two vectors \(v_1, v_2\in \mathbb{R}^2\) form a basis for \(\mathbb{R}^2\) if they span \(\mathbb{R}^2\), or equivalently, they are linearly independent.

After noting that the standard basis for \(\mathbb{R}^2\) mentioned above is a basis for \(\mathbb{R}^2\), we gave the following important definition.

Definition. Let \(T: \mathbb{R}^2\to \mathbb{R}^2\) be a linear transformation, \(\alpha := \{v_1, v_2\}\) and \(\beta := \{w_1, w_2\}\) be bases for \(\mathbb{R}^2\). Then the matrix of T with respect to \(\alpha\) and \(\beta\) is the matrix \([T]_{\alpha}^{\beta} = \begin{pmatrix} a & c\\b & d\end{pmatrix}\), where \(T(v_1) = aw_1+bw_2\) and \(T(v_2) = cw_1+dw_2\).

We then noted that \([T]^E_E\) is the easiest matrix to calculate, namely if \(T(x,y) = (ax+cy, bx+dy)\), then \([T]_E^E = \begin{pmatrix} a & c\\b & d\end{pmatrix}\). From this we noted that if \(v = (e,f)\), then in terms of columns, \(T(v) = T(e,f) = A\begin{pmatrix} e\\f\end{pmatrix}\), where \(A = [T]_E^E\). This equation holds whenever we use the standard basis as both the input basis and the output basis.

We finished class by noting that if \(T(x,y) = (2x-4y, x+7y)\), then \([T]_E^E = \begin{pmatrix} 2 & -4\\1 & 7\end{pmatrix}\), while if \(B = \{w_1, w_2\}\) with \(w_1 = (1,1)\) and \(w_2 = (1,2)\), \([T]_B^E = \begin{pmatrix} -2 & -6\\8 & 15\end{pmatrix}\), which follows easily from the fact that \(T(1,1) = (-2,8)\) and \(T(1,2) = (-6,15)\). However, to calculate \([T]_E^B\), we saw that this required solving two systems of equations, namely

\[r\begin{pmatrix} 1\\1\end{pmatrix} + s\begin{pmatrix} 1\\2\end{pmatrix} = \begin{pmatrix} 2\\1\end{pmatrix}\quad\quad \textrm{and}\quad\quad u\begin{pmatrix} 1\\1\end{pmatrix} + v\begin{pmatrix} 1\\2\end{pmatrix} = \begin{pmatrix} -4\\7\end{pmatrix}.\]
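
Solving these two small systems gives \(r = 3, s = -1\) and \(u = -15, v = 11\), so that

\[[T]_E^B = \begin{pmatrix} 3 & -15\\-1 & 11\end{pmatrix}.\]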

Thursday, February 19

After recalling the definition of a basis for \(\mathbb{R}^2\), linear transformation and the matrix of a linear transformation with respect to two bases, we considered the following question: Suppose we are given \(S,T: \mathbb{R}^2\to \mathbb{R}^2\) are two linear transformations. Then the composition \(ST:\mathbb{R}^2\to \mathbb{R}^2\) is also a linear transformation. Can a matrix of \(ST\) be expressed in terms of matrices for \(S\) and \(T\)?

Very Important Formula. Suppose \(S, T: \mathbb{R}^2\to \mathbb{R}^2\) are linear transformations, and \(\alpha, \beta, \gamma\) are bases for \(\mathbb{R}^2\). Then \(ST: \mathbb{R}^2\to \mathbb{R}^2\) is a linear transformation and

\[[ST]_{\alpha}^{\gamma} = [S]_{\beta}^{\gamma}\cdot [T]_{\alpha}^{\beta}.\]

After proving this formula, we described change of basis matrices, by noting that if \(\alpha, \beta\) are bases for \(\mathbb{R}^2\), then \([I_2]_{\alpha}^{\beta}\) is the matrix obtained by expressing the basis elements of \(\alpha\) in terms of the basis \(\beta\). Similarly, \([I_2]_{\beta}^{\alpha}\) is the matrix obtained by expressing the basis elements of \(\beta\) in terms of \(\alpha\). Using the very important formula above we saw that the matrices are inverses of one another:

\[I_2 = [I_2]_{\beta}^{\beta} = [I_2]_{\alpha}^{\beta} \cdot[I_2]_{\beta}^{\alpha}\quad \textrm{and}\quad I_2 = [I_2]_{\alpha}^{\alpha} = [I_2]_{\beta}^{\alpha} \cdot[I_2]_{\alpha}^{\beta}.\]

This led to the fundamental formula, which follows from the very important formula.

Change of Basis Formula. Suppose \(T: \mathbb{R}^2\to \mathbb{R}^2\) is a linear transformation and \(\alpha, \beta\) are bases for \(\mathbb{R}^2\). Then

\[[T]_{\beta}^{\beta} = [I_2]_{\alpha}^{\beta}\cdot [T]_{\alpha}^{\alpha}\cdot [I_2]_{\beta}^{\alpha}.\]

In particular, if we set \(A := [T]_{\alpha}^{\alpha}\) , \(B := [T]_{\beta}^{\beta}\) and \(P:= [I_2]_{\beta}^{\alpha}\), we obtain the classic expression

\[B = P^{-1}AP.\]

Tuesday, February 24

The class worked on practice problems for Exam I.

Thursday, February 26

Exam 1.

Tuesday, March 3

After making a few comments about Exam 1, we looked at the following

Example A. For the matrix \(A = \begin{pmatrix} 0 & -2\\1 & 3\end{pmatrix}\), find a non-zero column vector \(v\in \mathbb{R}^2\) and \(\lambda \in \mathbb{R}\) such that \(Av = \lambda v\).

We approached this example by setting \(v = \begin{pmatrix} x\\y\end{pmatrix}\) and setting up a system of equations, that ultimately became the matrix equation \(\begin{pmatrix} -\lambda & -2\\1 & -\lambda+3\end{pmatrix} \begin{pmatrix} x\\y\end{pmatrix} = \begin{pmatrix} 0\\0\end{pmatrix}\). Since the vector \(v\) should be non-zero, the matrix equation should have a non-trivial solution, which happens when \(\det \begin{pmatrix} -\lambda & -2\\1 & -\lambda+3\end{pmatrix} = 0\). This led to the polynomial equation \(\lambda^2-3\lambda+2 = 0\), which has solutions \(\lambda = 2\) and \(\lambda = 1\). These are the values of \(\lambda\) we seek. What about the corresponding vectors? We saw that when \(\lambda = 2\), the matrix equation becomes \(\begin{pmatrix} -2 & -2\\1 & 1\end{pmatrix} \begin{pmatrix} x\\y\end{pmatrix} = \begin{pmatrix}0\\0\end{pmatrix}\), which is easily seen to have a non-trivial solution \(v_1 = \begin{pmatrix} 1\\-1\end{pmatrix}\). Similarly, we saw that when \(\lambda = 1\), the matrix equation has coefficient matrix \(\begin{pmatrix} -1 & -2\\1 & 2\end{pmatrix}\), so \(v_2 = \begin{pmatrix} 2\\-1\end{pmatrix}\) is a non-trivial solution. Thus, we found two vectors \(v_1, v_2\) such that \(Av_1 = 2\cdot v_1\) and \(Av_2 = 1\cdot v_2\). We also noted that any multiple of \(v_1\) works for \(\lambda = 2\) and any multiple of \(v_2\) works for \(\lambda = 1\).

We then noted that \(v_1, v_2\) form a basis for \(\mathbb{R}^2\), since the determinant of \(P := \begin{pmatrix} 1 & 2\\-1 & -1\end{pmatrix}\) is not zero. We then saw that \(P^{-1} = \begin{pmatrix} -1 & -2\\1 & 1\end{pmatrix}\) and that if we set \(B = \begin{pmatrix} 2 & 0\\0 & 1\end{pmatrix}\), then \(B = P^{-1}AP\). Thus, we saw

Summary of Example A. First, there exist non-zero vectors \(v_1, v_2 \in \mathbb{R}^2\) with \(Av_1 = 2\cdot v_1\) and \(Av_2 = 1\cdot v_2\). The values 2 and 1 were found by solving the equation obtained by setting \(\det \begin{pmatrix}-\lambda & -2\\1 & -\lambda +3\end{pmatrix} = 0\); and second, if we take \(P\) to be the matrix with columns \(v_1, v_2\), then \(P^{-1}AP = \begin{pmatrix} 2 & 0\\0 & 1\end{pmatrix}\).
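
Computations like these are also easy to check numerically. A small NumPy sketch (purely illustrative, not from the lecture):

```python
import numpy as np

A = np.array([[0., -2.], [1., 3.]])

# Numerical eigenvalues; these should be 1 and 2 (in some order).
print(np.linalg.eig(A)[0])

# With P = [v1 v2] for v1 = (1, -1) and v2 = (2, -1),
# P^{-1} A P should be (approximately) the diagonal matrix diag(2, 1).
P = np.array([[1., 2.], [-1., -1.]])
print(np.linalg.inv(P) @ A @ P)
```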

This led to the following

Definitions. Suppose \(A\in \textrm{M}_2(\mathbb{R})\).

  1. (i) \(\lambda \in \mathbb{R}\) is an eigenvalue of \(A\) if there exists a non-zero vector \(v\in \mathbb{R}^2\) such that \(Av = \lambda v\). Any non-zero vector \(v\) satisfying \(Av = \lambda v\) is an eigenvector of \(A\) associated to \(\lambda\).
  2. (ii) \(A\) is said to be diagonalizable if there exists an invertible matrix \(P\in \textrm{M}_2(\mathbb{R})\) such that \(P^{-1}AP\) is a diagonal matrix, i.e., \(P^{-1}AP = \begin{pmatrix} \lambda_1 & 0\\0 & \lambda_2\end{pmatrix}\), for \(\lambda_1, \lambda_2\in \mathbb{R}\).

This led to formalizing the process of finding eigenvalues.

Proposition-Definition. Let \(A\in \textrm{M}_2(\mathbb{R})\). The eigenvalues of \(A\) are found by solving the polynomial equation \(\det (A-\lambda I_2) = 0\). The resulting polynomial \(\det (A-\lambda I_2)\) is called the characteristic polynomial of \(A\). If \(\lambda\) is an eigenvalue, then eigenvectors corresponding to \(\lambda\) are obtained by taking non-zero solutions to the system of equations \((A-\lambda I_2)\begin{pmatrix} x\\y\end{pmatrix} = \begin{pmatrix} 0\\0\end{pmatrix}\).

We ended class by repeating the steps in Example A for the matrix \(B = \begin{pmatrix} -1 & 2\\4 & -3\end{pmatrix}\).

Thursday, March 5

We began class by reviewing the following for \(A \in \textrm{M}_2(\mathbb{R})\):

  1. (i) \(\lambda \in \mathbb{R}\) is an eigenvalue of \(A\) if \(Av = \lambda v\), for some non-zero vector \(v\in \mathbb{R}^2\).
  2. (ii) Any non-zero vector satisfying \(Av = \lambda v\) is an eigenvector of \(A\) associated to the eigenvalue \(\lambda\).
  3. (iii) The eigenvalues are found by solving the equation \(\det (A-\lambda I_2) = 0\) for \(\lambda\). In other words, \(\alpha\) is an eigenvalue of \(A\) if and only if \(\alpha\) is a root of the polynomial \(p_A(x) := \det (A-xI_2)\). The polynomial \(p_A(x)\) is called the characteristic polynomial of \(A\).
  4. (iv) If \(\alpha\) is an eigenvalue of \(A\), the corresponding eigenvectors are the vectors \(v\in \mathbb{R}^2\) solving the matrix equation \((A-\alpha I_2)\cdot v = \vec{0}\), or equivalently, the ordered pairs that are solutions to the system of equations \[\begin{aligned}(-\alpha +a)x+by &=0\\ cx+(-\alpha + d)y &= 0,\end{aligned}\] where \(A = \begin{pmatrix} a & b\\c & d\end{pmatrix}\).
  5. (v) \(A\) is said to be diagonalizable if there exists invertible \(P\in \textrm{M}_2(\mathbb{R})\) such that \(P^{-1}AP = \begin{pmatrix} \alpha & 0\\0 & \beta\end{pmatrix}\), for some \(\alpha, \beta \in \mathbb{R}\).

We then considered the following

Example. Taking \(A = \begin{pmatrix} 2 & 1\\1 & 2\end{pmatrix}\), we found the eigenvalues 1 and 3, with corresponding eigenvectors \(v_1 = \begin{pmatrix} 1\\-1\end{pmatrix}\) and \(v_2 = \begin{pmatrix} 1\\1\end{pmatrix}\), respectively. We then showed that for \(P = \begin{pmatrix} 1 & 1\\-1 & 1\end{pmatrix}\), \(P^{-1}AP = \begin{pmatrix} 1 & 0\\0 & 3\end{pmatrix}\).

We then observed that the characteristic polynomial of the matrix \(A = \begin{pmatrix} 0 & -1\\1 & 0\end{pmatrix}\) is \(x^2+1\), so \(A\) does not have any (real) eigenvalues. This was explained geometrically, since multiplication of a vector in \(\mathbb{R}^2\) by \(A\) rotates the vector 90 degrees counter-clockwise, so that no vector in \(\mathbb{R}^2\) is mapped to a multiple of itself.

We then stated the following

Key Observations making theoretical discussions easier. Let \(H\) be a \(2\times 2\) matrix over \(\mathbb{R}\).

  1. (i) For \(w = \begin{pmatrix} \alpha\\\beta\end{pmatrix} \in \mathbb{R}^2\), \(Hw = \alpha C_1+\beta C_2\), where \(C_1, C_2\) are the columns of \(H\).
  2. (ii) Let \(L = [D_1\ D_2]\), i.e., \(L\) is a \(2\times 2\) matrix whose columns are \(D_1, D_2\). Then \(HL = [HD_1\ HD_2]\).

We ended class by stating, but not proving the following theorem, which formalizes the processes used in our previous examples.

Diagonalizability Theorem. Let \(A\) be a \(2\times 2\) matrix with entries in \(\mathbb{R}\).

  1. (i) Suppose \(A\) has two linearly independent eigenvectors \(v_1, v_2\), so that \(v_1, v_2\) form a basis for \(\mathbb{R}^2\). Then \(A\) is diagonalizable. More explicitly, if \(Av_1 = \alpha v_1\) and \(Av_2 = \beta v_2\), then \(P^{-1}AP = \begin{pmatrix} \alpha & 0\\0 & \beta\end{pmatrix}\), where \(P\) is the \(2\times 2\) matrix whose columns are \(v_1\) and \(v_2\).
  2. (ii) Suppose \(A\) is diagonalizable, i.e., there exists an invertible \(2\times 2\) matrix \(P\) such that \(P^{-1}AP = \begin{pmatrix} \alpha & 0\\0 & \beta\end{pmatrix}\), for \(\alpha, \beta \in \mathbb{R}\). Then, \(\alpha, \beta\) are the eigenvalues of \(A\), and if \(v_1, v_2\) are the columns of \(P\), \(Av_1 = \alpha v_1\) and \(Av_2 = \beta v_2\). In particular, \(\mathbb{R}^2\) has a basis consisting of eigenvectors of \(A\), since the columns of \(P\) form a basis for \(\mathbb{R}^2\).

Thus, \(A\) is diagonalizable if and only if \(\mathbb{R}^2\) has a basis consisting of eigenvectors of \(A\).

Tuesday, March 10

We began class by re-stating the Diagonalizability Theorem from the end of the previous lecture. With the theorem in mind, we then considered the following three scenarios.

Three possibilities. Suppose \(A \in \mathrm{M}_2(\mathbb{R})\), so that \(p_A(x)\) is a degree two polynomial with real coefficients. One of the following scenarios holds:

  1. (i) \(p_A(x)\) has two distinct (real) roots.
  2. (ii) \(p_A(x)\) has a repeated root, i.e., \(p_A(x) = (x-\lambda)^2\), for some \(\lambda \in \mathbb{R}\).
  3. (iii) \(p_A(x)\) has no real roots.

We noted that we have previously seen examples of types (i) and (iii). We then noted that for \(A = \begin{pmatrix} 2 & 7\\0 & 2\end{pmatrix}\), \(p_A(x) = (x-2)^2\) and that all eigenvectors are multiples of \(v = \begin{pmatrix} 1\\0\end{pmatrix}\), so that by the theorem, \(A\) is not diagonalizable.

We then gave a proof of the theorem characterizing diagonalizability. The proof of the theorem relied heavily on the two Key Observations from the previous lecture. We finished class by presenting the following very important corollary to the diagonalizability theorem.

Corollary. Suppose \(A\in \mathrm{M}_2(\mathbb{R})\) has two distinct eigenvalues. Then \(A\) is diagonalizable.

Thursday, March 12

Most of the class was spent discussing how diagonalizability of a \(2\times 2\) matrix can be used to solve a coupled system of two linear first order differential equations. In particular, given the system

\[\begin{align*} x_1'(t) &= ax_1(t)+bx_2(t)\\ x_2'(t) &= cx_1(t)+dx_2(t), \end{align*}\]

assuming the coefficient matrix \(A = \begin{pmatrix} a & b\\c & d\end{pmatrix}\) is diagonalizable, we worked through the derivation of the solution to the system and found that the solution took the form

\[\begin{pmatrix} x_1(t)\\x_2(t)\end{pmatrix} = c_1e^{\alpha t}v_1+c_2e^{\beta t}v_2,\]

where \(\alpha, \beta\) are the eigenvalues of \(A\), with corresponding eigenvectors \(v_1, v_2\in \mathbb{R}^2\), and where \(c_1, c_2\in \mathbb{R}\) are determined by the initial conditions \(\begin{pmatrix} x_1(0)\\x_2(0)\end{pmatrix} = P\cdot \begin{pmatrix} c_1\\c_2\end{pmatrix}\), for \(P = [v_1\ v_2]\).

We finished class by using the derivation above to solve the system

\[\begin{align*} x_1'(t) &= 2x_1(t)+x_2(t)\\ x_2'(t) &= x_1(t)+2x_2(t) \end{align*}\]

with initial conditions \(x_1(0) = 3, x_2(0) = -4\).
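
Carrying out this computation: \(A = \begin{pmatrix} 2 & 1\\1 & 2\end{pmatrix}\) has eigenvalues \(3, 1\) with corresponding eigenvectors \(v_1 = \begin{pmatrix} 1\\1\end{pmatrix}\), \(v_2 = \begin{pmatrix} 1\\-1\end{pmatrix}\), and the initial conditions give \(c_1+c_2 = 3\), \(c_1-c_2 = -4\), so \(c_1 = -\frac{1}{2}\), \(c_2 = \frac{7}{2}\) and

\[\begin{pmatrix} x_1(t)\\x_2(t)\end{pmatrix} = -\frac{1}{2}e^{3t}\begin{pmatrix} 1\\1\end{pmatrix}+\frac{7}{2}e^{t}\begin{pmatrix} 1\\-1\end{pmatrix}.\]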

Tuesday, March 24

We began class by discussing our second application of diagonalizability, namely, that if \(A\) is diagonalizable, it is easy to write down a formula for the \(n\)th power of \(A\), as follows, using the fact that for any square matrix \(C\), \((PCP^{-1})^n = PC^nP^{-1}\): If \(P^{-1}AP = D = \begin{pmatrix} \alpha & 0\\0 & \beta \end{pmatrix}\), then \(A = PDP^{-1}\) and

\[A^n = PD^nP^{-1} = P\begin{pmatrix} \alpha^n & 0\\0 & \beta^n\end{pmatrix} P^{-1}.\]

We then calculated \(A^n\) for \(A = \begin{pmatrix} -6 & 10\\-5 & 9\end{pmatrix}\) and \(n\geq 1\), and found that \(A^n = \begin{pmatrix} 2(-1)^n - 4^n & 2(-1)^{n+1}+2\cdot 4^n\\(-1)^{n}-4^n & (-1)^{n+1}+2\cdot 4^n\end{pmatrix}\).

We then defined \(e^A\), for \(A\) a diagonalizable matrix, by substituting \(A\) into the partial sum \(1+x+\frac{1}{2!}x^2+\frac{1}{3!}x^3+\cdots+\frac{1}{n!}x^n\) (the constant term 1 becoming \(I_2\)) and taking a limit in each entry. We noted, but did not prove, that each entry does converge to a real number so that \(e^A\) does exist. We noted that the entries of \(e^A\) are not \(e\) raised to the corresponding entry, but did note the important fact that if \(D = \begin{pmatrix} \alpha & 0\\0 & \beta\end{pmatrix}\), then \(e^D = \begin{pmatrix} e^{\alpha} & 0\\0 & e^{\beta}\end{pmatrix}\). When \(A\) is diagonalizable and \(P^{-1}AP = D\), putting \(A = PDP^{-1}\) into the partial sum expression enabled us to see that \(e^A = Pe^DP^{-1}\), which can easily be calculated.

We then calculated \(e^A\), for \(A = \begin{pmatrix} -6 & 10\\-5 & 9\end{pmatrix}\) and got \(e^A = \begin{pmatrix} 2e^{-1}-e^4 & -2e^{-1}+2e^4\\e^{-1}-e^4 & -e^{-1}+2e^4\end{pmatrix}\).
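
One can also check this against a numerical matrix exponential, for example SciPy's expm (purely illustrative):

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[-6., 10.], [-5., 9.]])
e = np.e

# The closed form for e^A obtained above via e^A = P e^D P^{-1}.
closed_form = np.array([[2/e - e**4, -2/e + 2*e**4],
                        [1/e - e**4, -1/e + 2*e**4]])

print(np.allclose(expm(A), closed_form))  # True
```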

We then reconciled our two applications by noting that if

\[\begin{align*} x_1'(t) &= ax_1(t)+bx_2(t)\\ x_2'(t) &= cx_1(t)+dx_2(t) \end{align*}\]

is a coupled system of first order linear differential equations that can be written as \(X'(t) = A\cdot X(t)\), with \(A = \begin{pmatrix} a & b\\c & d\end{pmatrix}\), then the solution takes the very agreeable form \(X(t) = e^{At}\cdot \begin{pmatrix} x_1(0)\\x_2(0)\end{pmatrix}\), which almost looks like the solution in the simple case of one function and one equation.

We ended class with yet a third application of diagonalizability, namely calculating the general term in the Fibonacci sequence \(1, 1, 2, 3, 5, 8, 13, \ldots\), which we indexed by setting \(a_0 = 0, a_1 = 1, a_2 = 1, a_3 = 2, \ldots\), with \(a_{k+2} = a_{k+1}+a_k\), for \(k\geq 0\). We noted that \(A^k\cdot v_0 = \begin{pmatrix} a_k\\a_{k+1}\end{pmatrix}\), for \(A = \begin{pmatrix} 0 & 1\\1 & 1\end{pmatrix}\) and \(v_0 = \begin{pmatrix} 0\\1\end{pmatrix}\). We saw that \(A\) was diagonalizable with eigenvalues \(\lambda_1 = \frac{1+\sqrt{5}}{2}\), \(\lambda_2 = \frac{1-\sqrt{5}}{2}\), so that we were able to calculate \(A^k\). This led to the closed form formula for the \(k\)th Fibonacci number as

\[a_k = \frac{1}{\sqrt{5}}\cdot \left\{ \left(\frac{1+\sqrt{5}}{2}\right)^k - \left(\frac{1-\sqrt{5}}{2}\right)^k\right\},\]

which is an integer!
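
A quick numerical check of the closed form against the recursion (purely illustrative):

```python
import math

phi = (1 + math.sqrt(5)) / 2
psi = (1 - math.sqrt(5)) / 2

def fib_closed(k):
    # Closed-form expression; rounding removes floating point error.
    return round((phi**k - psi**k) / math.sqrt(5))

a, b = 0, 1  # a_0 = 0, a_1 = 1
for k in range(30):
    assert fib_closed(k) == a
    a, b = b, a + b

print("Closed form matches the recursion for k = 0, ..., 29")
```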

Thursday, March 26

We began today's lecture by considering the following:

Definitions. Let \(A = \begin{pmatrix} a & b\\c & d\end{pmatrix} \in \mathrm{M}_2(\mathbb{R})\).

  1. (i) The transpose of \(A\), denoted \(A^t\), is the matrix \(A^t = \begin{pmatrix} a & c\\b & d\end{pmatrix}\).
  2. (ii) \(A\) is said to be symmetric if \(A = A^t\), i.e., it has the form \(\begin{pmatrix} a & b\\b & c\end{pmatrix}\).

We then considered the symmetric matrix \(A = \begin{pmatrix} 2 & 1\\1 & 2\end{pmatrix}\), which had eigenvalues 1, 3 with associated eigenvectors \(v_1 = \begin{pmatrix} 1\\-1\end{pmatrix}\) and \(v_2 = \begin{pmatrix} 1\\1\end{pmatrix}\) respectively. We noted the crucial fact that \(v_1\) and \(v_2\) are orthogonal. We then noted that if we take \(u_1 := \frac{1}{\sqrt{2}}v_1\) and \(u_2 = \frac{1}{\sqrt{2}}v_2\), we now have two orthogonal vectors of unit length. In this case we call \(\{u_1, u_2\}\) an orthonormal basis for \(\mathbb{R}^2\). We then noted that if \(Q = [u_1\ u_2]\), then \(Q^{-1}AQ = \begin{pmatrix} 1 & 0\\0 & 3\end{pmatrix}\), so that \(A\) is orthogonally diagonalizable. This led to the following:

Definitions. Let \(Q\in \mathrm{M}_2(\mathbb{R})\).

  1. (i) \(Q\) is an orthogonal matrix if its columns form an orthonormal basis for \(\mathbb{R}^2\), i.e., the columns of \(Q\) have length one and are orthogonal.
  2. (ii) \(A\in \mathrm{M}_2(\mathbb{R})\) is orthogonally diagonalizable if there exists an orthogonal matrix \(Q\in \mathrm{M}_2(\mathbb{R})\) such that \(Q^{-1}AQ\) is a diagonal matrix.

We then worked through the details showing that the symmetric matrix \(B = \begin{pmatrix} 2 & -2\\-2 & -1\end{pmatrix}\) is orthogonally diagonalizable. We were then able to state the important theorem.

Spectral Theorem for Real Symmetric \(2\times 2\) Matrices. Let \(A\in \mathrm{M}_2(\mathbb{R})\) be a symmetric matrix. Then \(A\) is orthogonally diagonalizable.

We finished class by using the quadratic formula to work through the following proposition which gives us crucial information about \(2\times 2\) real symmetric matrices.

Proposition A. Let \(A\) be a real \(2\times 2\) symmetric matrix, and assume that \(A\) is not a scalar matrix. Then \(A\) has two distinct real eigenvalues.
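
In outline, the computation behind Proposition A: writing \(A = \begin{pmatrix} a & b\\b & c\end{pmatrix}\), the characteristic polynomial is \(p_A(x) = x^2-(a+c)x+(ac-b^2)\), whose discriminant is

\[(a+c)^2-4(ac-b^2) = (a-c)^2+4b^2 \geq 0,\]

with equality only when \(a = c\) and \(b = 0\), i.e., only when \(A\) is a scalar matrix. Thus a non-scalar symmetric matrix has positive discriminant, and the quadratic formula produces two distinct real eigenvalues.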

We immediately noted that Proposition A tells us that any real \(2\times 2\) symmetric matrix is diagonalizable (by the Corollary from the lecture of March 10).

Tuesday, March 31

We continued our discussion of symmetric \(2\times 2\) matrices over \(\mathbb{R}\) by restating the Spectral Theorem from the previous lecture and recalling the terminology in the statement of the theorem. We also recalled that Proposition A from the previous lecture guarantees that a real \(2\times 2\) symmetric matrix that is not a scalar matrix has two distinct eigenvalues and is thus diagonalizable. Thus the key to orthogonal diagonalizability rests with the fact that the associated eigenvectors are orthogonal. For this we needed

A property of the dot product. Let \(A\in \mathrm{M}_2(\mathbb{R})\) and suppose \(v_1, v_2\in \mathbb{R}^2\) are column vectors. Then \((Av_1)\cdot v_2 = v_1\cdot (A^tv_2)\). In particular, if \(A\) is symmetric, then \((Av_1)\cdot v_2 = v_1\cdot (Av_2)\).

We had each half of the class calculate one side of the equation, and two students presented their calculation at the board. With this property in hand we were able to establish the following crucial fact.

Proposition B. Let \(A\) be a non-scalar symmetric \(2\times 2\) real matrix with distinct eigenvalues \(\lambda_1, \lambda_2\) and associated eigenvectors \(v_1, v_2\). Then \(v_1\cdot v_2 = 0\), i.e., \(v_1\) and \(v_2\) are orthogonal.
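
The proof of Proposition B is a short application of the property above: since \(A\) is symmetric,

\[\lambda_1(v_1\cdot v_2) = (Av_1)\cdot v_2 = v_1\cdot (Av_2) = \lambda_2(v_1\cdot v_2),\]

so \((\lambda_1-\lambda_2)(v_1\cdot v_2) = 0\), and since \(\lambda_1\neq \lambda_2\), we conclude \(v_1\cdot v_2 = 0\).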

We were then able to give

Proof of the Spectral Theorem. We may assume \(A\) is not a scalar matrix, since a scalar matrix is already diagonal and there is nothing to prove in that case. Thus, by Proposition A, the matrix \(A\) has distinct eigenvalues, say \(\alpha\) and \(\beta\), and hence by Proposition B, any choice of corresponding eigenvectors \(v_1, v_2\) is orthogonal. Thus, if we set \(u_1 := \frac{1}{||v_1||}\cdot v_1\), \(u_2 := \frac{1}{||v_2||}\cdot v_2\), then \(u_1\) is an eigenvector associated with \(\alpha\), \(u_2\) is an eigenvector associated with \(\beta\), \(u_1, u_2\) are orthogonal and have length one, and thus \(Q = [u_1\ u_2]\) is an orthogonal matrix satisfying \(Q^{-1}AQ = \begin{pmatrix} \alpha & 0\\0 & \beta\end{pmatrix}\), which gives what we want. \(\square\)

Thursday, April 2

We began class by recalling that if \(A\in \mathrm{M}_2(\mathbb{R})\), then we have three possibilities for \(p_A(x)\): Either \(p_A(x)\) has two distinct real roots, \(p_A(x)\) has a repeated real root, or \(p_A(x)\) does not have a real root. We noted that we have seen many times that in the first case \(A\) is diagonalizable, while in the second case, if \(A\) is not a scalar matrix and \(\lambda\) is the repeated root of \(p_A(x)\), then the eigenspace of \(\lambda\), namely the nullspace of \(A-\lambda I_2\), is one dimensional, and thus \(A\) is not diagonalizable.

We then analyzed in depth the example \(A = \begin{pmatrix} 1 & -2\\2 & 5\end{pmatrix}\). In this case \(p_A(x) = (x-3)^2\). We confirmed \(A\) is not diagonalizable by noting that the solution space of the homogeneous system with coefficient matrix \(\begin{pmatrix} 1-3 & -2\\2 & 5-3\end{pmatrix}\) is spanned by the vector \(w := \begin{pmatrix} 1\\-1\end{pmatrix}\). We took \(v_2 = \begin{pmatrix} 1\\0\end{pmatrix}\), which is not an eigenvector. We then set \(v_1 := (A-3I_2)v_2\) to get \(v_1 = \begin{pmatrix} -2\\2\end{pmatrix}\), which is an eigenvector. Taking \(P = [v_1\ v_2] = \begin{pmatrix} -2 & 1\\2 & 0\end{pmatrix}\) we saw that \(P^{-1}AP = \begin{pmatrix} 3 & 1\\0 & 3\end{pmatrix}\). We noted that this latter matrix is the Jordan canonical form of \(A\).

To explain why this example works, we noted that from the definition of \(v_1\), we have \(v_1 = Av_2-3v_2\), so that \(Av_2 = v_1+3v_2\). Thus,

\[\begin{aligned} AP &= A\cdot [v_1\ v_2] = [Av_1\ Av_2] = [Av_1\ v_1+3v_2]\\ &= [3v_1\ v_1+3v_2] = [v_1\ v_2]\begin{pmatrix} 3 & 1\\0 & 3\end{pmatrix} = P\begin{pmatrix} 3 & 1\\0 & 3\end{pmatrix}, \end{aligned}\]

which gives \(P^{-1}AP = \begin{pmatrix} 3 & 1\\0 & 3\end{pmatrix}\).

We then stated the following

Process for finding a Jordan canonical form and the change of basis matrix. Suppose \(A\in \mathrm{M}_2(\mathbb{R})\) satisfies \(p_A(x) = (x-\lambda)^2\) and \(A\) is not a scalar matrix. We find the Jordan canonical form of \(A\) as follows:

  1. (i) Find the nullspace of \((A-\lambda\cdot I_2)\), i.e., the eigenspace associated to \(\lambda\). This will be a one dimensional subspace of \(\mathbb{R}^2\).
  2. (ii) Choose any column vector \(v_2\in \mathbb{R}^2\) that is not an eigenvector, i.e., not in the eigenspace found in the previous step.
  3. (iii) Set \(v_1 := (A-\lambda\cdot I_2)v_2\). This vector is an eigenvector.
  4. (iv) Upon setting \(P = [v_1\ v_2]\) we obtain \(P^{-1}AP = \begin{pmatrix} \lambda & 1\\0 & \lambda\end{pmatrix}\), the Jordan canonical form of \(A\).

We ended class by briefly discussing what we will do in the three remaining classes before the second midterm exam on April 16.

Tuesday, April 7

Today we spent most of the class discussing properties of the complex numbers \(\mathbb{C}\) with the goal of noting that almost all of the linear algebra we have been doing can be done in exactly the same manner over \(\mathbb{C}\) as our previous efforts over \(\mathbb{R}\). We then discussed the following properties of complex numbers.

Properties of complex numbers. We let \(\mathbb{C}\) denote the number system of complex numbers, i.e., all numbers of the form \(a+bi\), with \(a,b\in \mathbb{R}\) and \(i^2 = -1\). Set \(z_1 := a+bi\) and \(z_2 := c+di\).

  1. (i) \(z_1 = z_2\) if and only if \(a = c\) and \(b = d\).
  2. (ii) By definition: \(z_1+z_2 = (a+c)+(b+d)i\) and \(z_1z_2 = (ac-bd)+(bc+ad)i\).
  3. (iii) Addition and multiplication of complex numbers enjoy all of the usual properties of arithmetic, e.g., addition and multiplication are commutative, multiplication distributes over addition, etc.
  4. (iv) Non-zero complex numbers have a multiplicative inverse, e.g., if \(z_1\not= 0\), then \(z_1^{-1} = \frac{a}{a^2+b^2} - \frac{b}{a^2+b^2}i\).
  5. (v) The complex conjugate of \(z_1\) is the complex number \(\overline{z_1} = a-bi\). Complex conjugation enjoys the following properties: \(\overline{z_1+z_2} = \overline{z_1}+\overline{z_2}\) and \(\overline{z_1z_2} = \overline{z_1}\cdot\overline{z_2}\).

We then stated the Very Important

Fundamental Theorem of Algebra. Every nonconstant polynomial with real or complex coefficients has all of its roots in \(\mathbb{C}\). In particular, if \(f(x)\) is a monic polynomial of degree \(d\geq 1\) with real or complex coefficients, then there exist \(\alpha_1, \ldots, \alpha_d\in \mathbb{C}\) such that \(f(x) = (x-\alpha_1)\cdots(x-\alpha_d)\).

We noted the importance of this fact for linear algebra: The characteristic polynomial of any square matrix over \(\mathbb{R}\) has all of its roots in \(\mathbb{C}\), i.e., any such matrix has all of its eigenvalues in \(\mathbb{C}\). We then showed that any non-zero complex number has two square roots, meaning that we are free to solve degree two polynomials with complex coefficients using the quadratic formula.
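
For instance, the two square roots of \(i\) are \(\pm\frac{1}{\sqrt{2}}(1+i)\), since

\[\left(\frac{1+i}{\sqrt{2}}\right)^2 = \frac{1+2i+i^2}{2} = \frac{2i}{2} = i.\]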

We ended class by noting that almost all of the linear algebra we have done thus far, e.g., solving equations using Gaussian elimination, finding eigenvalues and eigenvectors, diagonalizing matrices, finding JCF, etc., can be done in exactly the same way over \(\mathbb{C}\) as we have been doing over \(\mathbb{R}\), with a minor difference arising from orthogonal diagonalization. We noted that over \(\mathbb{C}\) we have to alter the dot product of two vectors as follows: Suppose \(v_1 = \begin{pmatrix} \alpha\\\beta\end{pmatrix}\) and \(v_2 = \begin{pmatrix} \gamma\\\delta\end{pmatrix}\) with \(\alpha, \beta, \gamma, \delta\in \mathbb{C}\), then \(v_1\cdot v_2 := \alpha\overline{\gamma}+\beta\overline{\delta}\). This definition ensures that over \(\mathbb{C}\) the dot product of a non-zero vector with itself is non-zero.

Thursday, April 9

We began class by pointing out that virtually everything we have done this semester works in exactly the same way over \(\mathbb{C}\) as over \(\mathbb{R}\), except for issues related to orthogonality, since the dot product over \(\mathbb{C}\) involves complex conjugation. We illustrated this principle by working through the following examples.

Example 1. \(A = \begin{pmatrix} 0 & 1\\-1 & 0\end{pmatrix}\). We noted \(A\) has distinct eigenvalues \(i, -i\) and is therefore diagonalizable. We worked out the details in the usual way, the only difference being that the calculations took place in \(\mathbb{C}\).
Example 2. \(B = \begin{pmatrix} 0 & 4\\1 & 4i\end{pmatrix}\). This matrix has a repeated eigenvalue of \(2i\), and we worked through the usual process of finding \(P\) such that \(P^{-1}BP = \begin{pmatrix} 2i & 1\\0 & 2i\end{pmatrix}\), the Jordan canonical form of \(B\).
Example 3. \(A = \begin{pmatrix} 0 & i\\i & 1\end{pmatrix}\), so that \(A\) is a symmetric matrix. We saw that \(A\) is diagonalizable, since it has distinct eigenvalues. On the other hand, the eigenvectors are not orthogonal, so that \(A\) is not orthogonally diagonalizable, even though it is a symmetric matrix over \(\mathbb{C}\).
Example 4. \(B = \begin{pmatrix} 0 & -i\\i & 1\end{pmatrix}\). We saw that \(B = \overline{B}^t\), i.e., \(B\) equals its conjugate transpose. We then worked through the details showing that \(B\) is orthogonally diagonalizable over \(\mathbb{C}\), i.e., there is a diagonalizing matrix with orthogonal columns having length one. This followed since the eigenvectors for different eigenvalues in this case are orthogonal vectors in \(\mathbb{C}^2\).

Tuesday, April 14

The class worked in groups on practice problems for Exam 2.

Thursday, April 16

Exam 2.

Tuesday, April 21

We began a review of topics already covered, but in a general setting. Thus, we discussed the operations of sum and scalar multiplication for \(m\times n\) matrices and the usual properties that hold regarding these operations; the product of an \(m\times n\) matrix with an \(n\times p\) matrix and the usual arithmetic properties that hold; invertibility of \(n\times n\) matrices; and determinants of \(n\times n\) matrices. Regarding determinants, we first showed how to calculate the determinant of a \(3\times 3\) matrix by expanding along the first row or second column. Then we gave the following formulas:

Laplace Expansion Formulas. Let \(A = (a_{ij})\) be an \(n\times n\) matrix with entries in \(\mathbb{R}\) or \(\mathbb{C}\). Write \(A_{ij}\) for the \((n-1)\times(n-1)\) matrix obtained by removing the \(i\)th row and \(j\)th column of \(A\).

  1. (i) Expansion along the \(i\)th row: \(|A| = \sum_{j=1}^n (-1)^{i+j} a_{ij}|A_{ij}|\).
  2. (ii) Expansion along the \(j\)th column: \(|A| = \sum_{i=1}^n (-1)^{i+j} a_{ij}|A_{ij}|\).
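
For instance, expanding along the first row:

\[\det\begin{pmatrix} 1 & 2 & 3\\4 & 5 & 6\\7 & 8 & 10\end{pmatrix} = 1\cdot\det\begin{pmatrix} 5 & 6\\8 & 10\end{pmatrix}-2\cdot\det\begin{pmatrix} 4 & 6\\7 & 10\end{pmatrix}+3\cdot\det\begin{pmatrix} 4 & 5\\7 & 8\end{pmatrix} = 2+4-9 = -3.\]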

We ended class by discussing the following properties — previously discussed for \(2\times 2\) matrices — noting that these can be used to simplify determinant calculations.

Properties of determinants. Let \(A\) be an \(n\times n\) matrix over \(\mathbb{R}\) or \(\mathbb{C}\).

  1. (i) If \(A'\) is obtained from \(A\) by multiplying a row or column of \(A\) by \(\lambda\in \mathbb{R}\) (or \(\mathbb{C}\)), then \(|A'| = \lambda\cdot |A|\).
  2. (ii) If \(A'\) is obtained from \(A\) by adding a multiple of one row of \(A\) to another row, or a multiple of one column to another column, then \(|A'| = |A|\).
  3. (iii) If \(A\) has a row or column consisting of zeroes, or two equal rows, or two equal columns, then \(|A| = 0\).
  4. (iv) If \(A\) is upper or lower triangular, then \(|A| = a_{11}\cdot a_{22}\cdots a_{nn}\).
  5. (v) If \(B\) is an \(n\times n\) matrix over \(\mathbb{R}\) or \(\mathbb{C}\), then \(|AB| = |A|\cdot |B|\).
  6. (vi) \(A\) is invertible if and only if \(|A| \not= 0\).

Thursday, April 23

We continued with revisiting concepts we have covered throughout the semester, but from a more general point of view, beginning with a corollary to item (vi) of the properties of determinants given in the previous lecture:

Corollary. Suppose \(A\) is an \(n\times n\) coefficient matrix for the system of linear equations \(A\cdot \begin{pmatrix} x_1\\\vdots\\x_n\end{pmatrix} = \begin{pmatrix} b_1\\\vdots\\b_n\end{pmatrix}\). If \(|A|\not= 0\), then the system has a unique solution: \(\begin{pmatrix} x_1\\\vdots\\x_n\end{pmatrix} = A^{-1}\cdot \begin{pmatrix} b_1\\\vdots\\b_n\end{pmatrix}\).

We then discussed the concepts of linear dependence and linear independence for column vectors in \(\mathbb{R}^n\):

Definition. Column vectors \(C_1, \ldots, C_r\in \mathbb{R}^n\) are said to be linearly dependent if the following equivalent statements hold:

  1. (i) We can write some \(C_i = \alpha_1 C_1+\cdots+\alpha_{i-1}C_{i-1}+\alpha_{i+1}C_{i+1}+\cdots+\alpha_r C_r\).
  2. (ii) There exists a non-trivial dependence relation \(\alpha_1 C_1+\cdots+\alpha_r C_r = 0\), with not all \(\alpha_j = 0\).

The vectors \(C_1, \ldots, C_r\) are linearly independent if they are not linearly dependent.

We pointed out the important facts that any set of vectors spanning \(\mathbb{R}^n\) must have at least \(n\) elements and any set of linearly independent vectors can have no more than \(n\) elements. Thus, a basis for \(\mathbb{R}^n\) always has \(n\) elements, where a basis is a collection of linearly independent vectors that span \(\mathbb{R}^n\).

We again stated the following key observation, in our general setting.

Key Observation. Let \(A\) be an \(m\times r\) matrix with columns \(C_1, \ldots, C_r\), and set \(v = \begin{pmatrix} \alpha_1\\\alpha_2\\\vdots\\\alpha_r\end{pmatrix}\). Then:

\[Av = \alpha_1 C_1+\alpha_2 C_2+\cdots+\alpha_r C_r.\]

As an immediate corollary we have the following fundamental theorem.

Theorem. Let \(A\) be as in the Key Observation. Then the homogeneous system of equations \(A\cdot \begin{pmatrix} x_1\\\vdots\\x_r\end{pmatrix} = \begin{pmatrix} 0\\\vdots\\0\end{pmatrix}\) has a nontrivial solution if and only if \(C_1, \ldots, C_r\) are linearly dependent. Equivalently, the system has a unique solution (necessarily the zero solution) if and only if \(C_1, \ldots, C_r\) are linearly independent.

We then discussed how the process of solving a system of \(m\) equations in \(n\) unknowns is exactly the same as for solving smaller systems of equations: one applies elementary row operations to the corresponding augmented matrix until the left hand side is in reduced row echelon form, where one can just read off the solutions. We ended class by writing down the solution set to a system of equations whose augmented matrix in reduced row echelon form was:

\[\begin{bmatrix} 1 & 0 & 5 & 0 & 2 & 0 & 7\\0 & 1 & 4 & 0 & -1 & 0 & 6\\0 & 0 & 0 & 1 & 5 & 0 & 7\\0 & 0 & 0 & 0 & 0 & 1 & 7\end{bmatrix}.\]

Using the variables \(x_1, \ldots, x_6\), this augmented matrix gives

\[\begin{align*} x_1 &= 7-5x_3-2x_5\\ x_2 &= 6-4x_3+x_5\\ x_3 &= x_3\\ x_4 &= 7-5x_5\\ x_5 &= x_5\\ x_6 &= 7. \end{align*}\]

Using the independent parameters \(t_1 = x_3\) and \(t_2 = x_5\), we saw that the solution set was given by

\[\left\{\begin{pmatrix} 7\\6\\0\\7\\0\\7\end{pmatrix} + t_1\cdot \begin{pmatrix} -5\\-4\\1\\0\\0\\0\end{pmatrix} + t_2\cdot \begin{pmatrix} -2\\1\\0\\-5\\1\\0\end{pmatrix}\ \bigg|\ t_1, t_2\in \mathbb{R}\right\}.\]

We finished by observing that if the same coefficient matrix was used for a homogeneous system of equations, then the vectors \(\begin{pmatrix} -5\\-4\\1\\0\\0\\0\end{pmatrix}\), \(\begin{pmatrix} -2\\1\\0\\-5\\1\\0\end{pmatrix}\) form a basis for the solution space.
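
As a quick check (not carried out in class), the first of these vectors does satisfy the reduced homogeneous equations: the first row of the coefficient matrix gives \(1\cdot(-5)+5\cdot 1+2\cdot 0 = 0\), the second gives \(1\cdot(-4)+4\cdot 1-1\cdot 0 = 0\), and the third and fourth rows give \(0\) as well; the same holds for the second vector. Since elementary row operations do not change the solution set, this verifies that both vectors solve the original homogeneous system.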

Tuesday, April 28

We continued revisiting concepts covered throughout the semester from a more general point of view, beginning with a discussion of the following:

Exchange Theorem. Let \(v_1, \ldots, v_r\) be linearly independent vectors in \(\mathbb{R}^n\) and suppose \(u_1, \ldots, u_m\) span \(\mathbb{R}^n\). Then \(r\leq m\) and after re-indexing the \(u_j\) if necessary, \(v_1, \ldots, v_r, u_{r+1}, \ldots, u_m\) span \(\mathbb{R}^n\). The same holds for vectors in \(\mathbb{C}^n\).
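
For a small illustration (not the example from class): in \(\mathbb{R}^2\), take \(v_1 = \begin{pmatrix} 1\\1\end{pmatrix}\) (a linearly independent set by itself) and the spanning set \(u_1 = \begin{pmatrix} 1\\0\end{pmatrix}\), \(u_2 = \begin{pmatrix} 0\\1\end{pmatrix}\). Here \(r = 1\leq 2 = m\), and indeed \(v_1, u_2\) span \(\mathbb{R}^2\), since \(u_1 = v_1-u_2\).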

We then recalled that a collection of vectors in \(\mathbb{R}^n\) (or \(\mathbb{C}^n\)) is a basis if the vectors are both linearly independent and span \(\mathbb{R}^n\). We noted that the Exchange Theorem implies that all bases for \(\mathbb{R}^n\) (or \(\mathbb{C}^n\)) have \(n\) elements. We then stated the following general version of a theorem we have used throughout the semester for vectors in \(\mathbb{R}^2\) or \(\mathbb{R}^3\).

Theorem. Let \(v_1, \ldots, v_n\) be column vectors in \(\mathbb{R}^n\) or \(\mathbb{C}^n\), and let \(A\) denote the \(n\times n\) matrix whose columns are \(v_1, \ldots, v_n\). Then \(\{v_1, \ldots, v_n\}\) is a basis for \(\mathbb{R}^n\) or \(\mathbb{C}^n\) if and only if \(\det(A) \not= 0\).
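
For example (an illustrative pair of vectors), \(v_1 = \begin{pmatrix} 1\\1\end{pmatrix}\), \(v_2 = \begin{pmatrix} 1\\-1\end{pmatrix}\) form a basis for \(\mathbb{R}^2\), since \(\det\begin{pmatrix} 1 & 1\\1 & -1\end{pmatrix} = -2\not= 0\), while \(\begin{pmatrix} 1\\2\end{pmatrix}\), \(\begin{pmatrix} 2\\4\end{pmatrix}\) do not form a basis, since \(\det\begin{pmatrix} 1 & 2\\2 & 4\end{pmatrix} = 0\).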

We then began a discussion of eigenvalues and diagonalizability. Working over \(\mathbb{R}\) or \(\mathbb{C}\) we considered an \(n\times n\) matrix \(A\) with eigenvalue \(\lambda\). Writing \(p_A(x) = (x-\lambda)^e g(x)\), with \(g(\lambda) \not= 0\), we defined \(e\) to be the algebraic multiplicity of \(\lambda\). We defined \(E_\lambda\), the eigenspace of \(\lambda\), to be the solution space of the homogeneous system of equations with \(A-\lambda I_n\) as its coefficient matrix. Only the eigenspace terminology is new, as we have been studying this solution space for several weeks. Then we defined the geometric multiplicity of \(\lambda\) to be the number of independent parameters needed to describe the solutions in the eigenspace \(E_\lambda\).
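
A small example (chosen for illustration rather than taken from class): for \(A = \begin{pmatrix} 2 & 1\\0 & 2\end{pmatrix}\) we have \(p_A(x) = (x-2)^2\), so \(\lambda = 2\) has algebraic multiplicity 2. On the other hand, \(A-2I_2 = \begin{pmatrix} 0 & 1\\0 & 0\end{pmatrix}\), and the corresponding homogeneous system forces \(x_2 = 0\) with \(x_1\) free, so \(E_2\) is described by a single independent parameter and the geometric multiplicity of \(\lambda = 2\) is 1.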

We then stated the following important theorem.

Theorem. Let \(A\) be an \(n\times n\) matrix over \(\mathbb{R}\) or \(\mathbb{C}\), and \(\lambda\) an eigenvalue. Then the geometric multiplicity of \(\lambda\) is less than or equal to the algebraic multiplicity of \(\lambda\).

We gave a proof of this theorem in the special case that \(A\) is a \(3\times 3\) matrix over \(\mathbb{R}\) with eigenvalue \(\lambda\) such that \(\lambda\) has geometric multiplicity equal to 2.

We ended class by stating one of the most important theorems for our course, which has been illustrated by various examples we have looked at during all of our recent discussions involving diagonalizability and non-diagonalizability.

Diagonalizability Theorem. Let \(A\) be an \(n\times n\) matrix over \(\mathbb{R}\) or \(\mathbb{C}\). The following are equivalent.

  1. (i) \(A\) is diagonalizable.
  2. (ii) \(\mathbb{R}^n\) (respectively, \(\mathbb{C}^n\)) has a basis consisting of eigenvectors for \(A\).
  3. (iii) \(p_A(x) = (x-\lambda_1)^{e_1}\cdots(x-\lambda_r)^{e_r}\) and for each \(\lambda_i\), the geometric multiplicity of \(\lambda_i\) equals the algebraic multiplicity of \(\lambda_i\).
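
For instance, the \(2\times 2\) example above with \(p_A(x) = (x-2)^2\) and geometric multiplicity 1 fails condition (iii), so it is not diagonalizable; by contrast, \(B = \begin{pmatrix} 1 & 2\\0 & 3\end{pmatrix}\) (another illustrative matrix) has \(p_B(x) = (x-1)(x-3)\), with algebraic and geometric multiplicity both equal to 1 for each eigenvalue, so \(B\) is diagonalizable.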
Thursday, April 30

We began class by re-stating and discussing the Diagonalizability Theorem presented at the end of the previous lecture. We noted that the proof of the theorem (which we did not give) depends on the following

Important Fact. Let \(A\) be an \(n\times n\) matrix over \(\mathbb{R}\) or \(\mathbb{C}\). If \(\lambda_1, \ldots, \lambda_r\) are distinct eigenvalues with corresponding eigenvectors \(v_1, \ldots, v_r\), then \(v_1, \ldots, v_r\) are linearly independent.

Combining this fact with the Diagonalizability Theorem, we were able to derive the following.

Corollary. If \(A\) is an \(n\times n\) matrix over \(\mathbb{R}\) or \(\mathbb{C}\) and \(A\) has \(n\) distinct eigenvalues (in \(\mathbb{R}\) or \(\mathbb{C}\), respectively), then \(A\) is diagonalizable.

We then presented the following special case of the crucial (iii) implies (i) in the Diagonalizability Theorem.

Special Case. Suppose \(A\) is a \(3\times 3\) matrix over \(\mathbb{R}\) or \(\mathbb{C}\) such that \(p_A(x) = (x-\lambda_1)^2(x-\lambda_2)\), with \(\lambda_1\not= \lambda_2\). If the algebraic multiplicities of \(\lambda_1\) and \(\lambda_2\) equal their geometric multiplicities, then \(A\) is diagonalizable.

Proof. Suppose \(v_1, v_2\in E_{\lambda_1}\) are linearly independent eigenvectors for \(\lambda_1\) and \(w\in E_{\lambda_2}\) is an eigenvector for \(\lambda_2\). We first note that \(v_1, v_2, w\) are linearly independent. To see this, suppose \(av_1+bv_2+cw = 0\); we need to see that \(a = b = c = 0\). If \(c\not= 0\), then we can write

\[w = -\frac{a}{c}\cdot v_1-\frac{b}{c}\cdot v_2,\]

showing that \(w\in E_{\lambda_1}\). But then \(\lambda_1 w = Aw = \lambda_2 w\), so \((\lambda_1-\lambda_2)\cdot w = 0\), and since \(\lambda_1\not= \lambda_2\), this forces \(w = 0\), contradicting the fact that \(w\), being an eigenvector for \(\lambda_2\), is nonzero. Thus \(c = 0\). Since \(v_1, v_2\) are linearly independent by assumption, \(a = 0 = b\), showing that \(v_1, v_2, w\) are linearly independent. Since we have three linearly independent vectors in \(\mathbb{R}^3\) (or \(\mathbb{C}^3\)), they form a basis, and thus \(P = [v_1\ v_2\ w]\) is invertible. Thus, we have

\[\begin{align*} AP &= A[v_1\ v_2\ w] = [Av_1\ Av_2\ Aw] = [\lambda_1 v_1\ \lambda_1 v_2\ \lambda_2 w]\\ &= [v_1\ v_2\ w]\cdot\begin{pmatrix}\lambda_1 & 0 & 0\\0 & \lambda_1 & 0\\0 & 0 & \lambda_2\end{pmatrix} = P\cdot\begin{pmatrix}\lambda_1 & 0 & 0\\0 & \lambda_1 & 0\\0 & 0 & \lambda_2\end{pmatrix}. \end{align*}\]

It follows that \(P^{-1}AP = \begin{pmatrix}\lambda_1 & 0 & 0\\0 & \lambda_1 & 0\\0 & 0 & \lambda_2\end{pmatrix}\), showing that \(A\) is diagonalizable. \(\square\)

We noted that the key to the proof was that the existence of two independent vectors in \(E_{\lambda_1}\) enabled us to create the invertible matrix \(P\). If \(E_{\lambda_1}\) had only one independent eigenvector, we could not have created a \(3\times 3\) matrix of eigenvectors with linearly independent columns.
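
A concrete instance of this failure (not from lecture): for \(A = \begin{pmatrix} 2 & 1 & 0\\0 & 2 & 0\\0 & 0 & 5\end{pmatrix}\) we have \(p_A(x) = (x-2)^2(x-5)\), but \(A-2I_3 = \begin{pmatrix} 0 & 1 & 0\\0 & 0 & 0\\0 & 0 & 3\end{pmatrix}\) forces \(x_2 = 0 = x_3\) with only \(x_1\) free, so \(E_2\) is described by a single independent parameter. With only one independent eigenvector for \(\lambda_1 = 2\) and one for \(\lambda_2 = 5\), we cannot assemble three linearly independent columns of eigenvectors, and this \(A\) is not diagonalizable.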

We then ended our discussion of general versions of topics covered this semester by stating the following.

Spectral Theorem for Real Matrices. Let \(A\) be an \(n\times n\) matrix with entries in \(\mathbb{R}\). Then \(A\) is symmetric if and only if \(A\) is orthogonally diagonalizable, i.e., there exists \(Q\in \mathrm{M}_n(\mathbb{R})\) such that \(Q^{-1}AQ\) is diagonal and \(Q\) is an orthogonal matrix, i.e., if \(C_1, \ldots, C_n\) are the columns of \(Q\), then \(C_i\cdot C_j = 0\), for \(i\not= j\) and \(\|C_j\| = 1\), for all \(1\leq j\leq n\).
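
For example (a small symmetric matrix chosen for illustration), \(A = \begin{pmatrix} 2 & 1\\1 & 2\end{pmatrix}\) has \(p_A(x) = (x-1)(x-3)\), with eigenvectors \(\begin{pmatrix} 1\\-1\end{pmatrix}\) for \(\lambda = 1\) and \(\begin{pmatrix} 1\\1\end{pmatrix}\) for \(\lambda = 3\). Normalizing these to length 1 gives the orthogonal matrix

\[Q = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1\\-1 & 1\end{pmatrix},\qquad Q^{-1}AQ = \begin{pmatrix} 1 & 0\\0 & 3\end{pmatrix}.\]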

We then began a discussion of vector spaces in the abstract, starting with the definition below. We agreed to write \(F\) to indicate either \(\mathbb{R}\) or \(\mathbb{C}\).

Definition. A vector space over \(F\) is a set \(V\) together with addition and scalar multiplication satisfying the following axioms:

  1. (i) An additive identity exists: There exists \(\vec{0}\in V\) such that \(v+\vec{0} = v = \vec{0}+v\), for all \(v\in V\).
  2. (ii) Additive inverses exist: Given \(v\in V\), there exists \(-v\in V\) such that \((-v)+v = \vec{0} = v+(-v)\).
  3. (iii) Addition is commutative: \(v_1+v_2 = v_2+v_1\), for all \(v_i\in V\).
  4. (iv) Addition is associative: \(v+(u+w) = (v+u)+w\), for all \(u,v,w\in V\).
  5. (v) Scalar multiplication distributes over addition: \(\lambda\cdot(v+w) = \lambda v+\lambda w\), for all \(\lambda\in F\) and \(v,w\in V\).
  6. (vi) Scalar addition is distributive: \((\lambda_1+\lambda_2)\cdot v = \lambda_1\cdot v+\lambda_2\cdot v\), for \(\lambda_1,\lambda_2\in F\) and \(v\in V\).
  7. (vii) Scalar multiplication is associative: \((\lambda_1\lambda_2)\cdot v = \lambda_1\cdot(\lambda_2\cdot v)\), for all \(\lambda_1,\lambda_2\in F\) and \(v\in V\).
  8. (viii) \(1\cdot v = v\) and \(0\cdot v = \vec{0}\), for all \(v\in V\).

We ended class by giving the following examples and discussing how they are all similar, in terms of their underlying vector space structure.

Examples of vector spaces. The following are vector spaces over the indicated number system.

  1. (i) Column vectors in \(\mathbb{R}^2\), \(\mathbb{R}^3\), \(\mathbb{R}^n\), or \(\mathbb{C}^n\).
  2. (ii) \(P_n(F)\), polynomials of degree less than or equal to \(n\) over \(F\).
  3. (iii) \(\mathrm{M}_n(F)\), \(n\times n\) matrices over \(F\).
  4. (iv) The set of continuous functions from \([a,b]\) to \(\mathbb{R}\).
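
To illustrate how these examples share the same underlying structure, note for instance that in \(P_2(\mathbb{R})\) the operations mirror those on column vectors in \(\mathbb{R}^3\): writing \(p(x) = a_0+a_1x+a_2x^2\) and \(q(x) = b_0+b_1x+b_2x^2\), we have \(p+q = (a_0+b_0)+(a_1+b_1)x+(a_2+b_2)x^2\) and \(\lambda\cdot p = \lambda a_0+\lambda a_1x+\lambda a_2x^2\), exactly as if we were adding and scaling the coordinate vectors \(\begin{pmatrix} a_0\\a_1\\a_2\end{pmatrix}\) and \(\begin{pmatrix} b_0\\b_1\\b_2\end{pmatrix}\) in \(\mathbb{R}^3\).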